UCT/CUDA_IPC: Enforce host memory support for mem_type EP #10933
base: master
-    if ((getpid() == *(pid_t*)params->iface_addr) && same_uuid &&
-        !iface->config.enable_same_process) {
+    if ((getpid() == *(pid_t*)params->iface_addr) && same_uuid) {
Now we will always prevent cuda_ipc within the same process. Is that what we want? If not, maybe:
    if ((getpid() == *(pid_t*)params->iface_addr) && same_uuid) {
        return uct_iface_scope_is_reachable(tl_iface, params);
    }
Actually, this block will be removed altogether, but as this PR is not yet ready, some things are not committed yet.
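To make the reviewer's concern concrete, here is a minimal standalone sketch of the condition before and after the change. It assumes that the guarded block rejects the peer as unreachable, which is what the comment above implies but the diff excerpt does not show. All `sketch_*` names are hypothetical and are not part of UCX.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Condition before the change: a same-process, same-device peer is rejected
 * only when the (now removed) enable_same_process option is off.
 * Assumption: entering the block means "not reachable". */
bool sketch_rejects_before(pid_t remote_pid, bool same_uuid,
                           bool enable_same_process)
{
    return (getpid() == remote_pid) && same_uuid && !enable_same_process;
}

/* Condition after the change: the option no longer exists, so a
 * same-process, same-device peer always enters the block. */
bool sketch_rejects_after(pid_t remote_pid, bool same_uuid)
{
    return (getpid() == remote_pid) && same_uuid;
}

/* Walks through the case the reviewer is asking about. */
void sketch_check_same_process_case(void)
{
    pid_t self = getpid();

    /* Previously the option could allow same-process use... */
    assert(!sketch_rejects_before(self, true, true));
    /* ...whereas the new condition rejects it unconditionally. */
    assert(sketch_rejects_after(self, true));
}
```

The only behavioral difference is the case where the option was enabled, which is exactly the case the question is about.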
Walkthrough
Introduces host memory-awareness for UCP memory-type endpoints, implements a per-endpoint CUDA flush mechanism with stream event tracking, removes the enable_same_process configuration option from CUDA IPC, updates endpoint flush bindings across CUDA transports, and adjusts tests accordingly.
Sequence Diagram(s)
sequenceDiagram
    participant Caller
    participant EP as CUDA EP
    participant Flush as Flush Handler
    participant Streams as Active Streams
    participant Completion
    Caller->>EP: uct_cuda_base_ep_flush()
    alt No active streams
        EP-->>Caller: UCS_OK (immediate)
    else Active streams exist
        EP->>Flush: Allocate base flush descriptor
        loop For each active stream
            EP->>Streams: Allocate per-stream flush descriptor
            Streams->>Streams: Register callback on stream event queue
            EP->>EP: Increment shared stream_counter
        end
        EP-->>Caller: UCS_INPROGRESS
        par Stream Processing
            Streams->>Streams: Process stream events
            Streams->>Flush: Invoke completion callback when counter reaches zero
            Flush->>Completion: Call user completion
            Flush->>Flush: Free flush descriptor
        end
    end
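The counting scheme in the diagram can be sketched as a small standalone model. This is not the UCX implementation: the `sketch_*` types and functions are hypothetical, the descriptor is caller-owned rather than allocated, and a `released` flag stands in for freeing it.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { SKETCH_OK = 0, SKETCH_INPROGRESS = 1 } sketch_status_t;

typedef void (*sketch_comp_cb_t)(void *arg);

/* Hypothetical stand-in for the base flush descriptor. */
typedef struct {
    unsigned         stream_counter; /* per-stream flushes still pending */
    sketch_comp_cb_t comp_cb;        /* user completion callback */
    void             *comp_arg;
    int              released;       /* models "free flush descriptor" */
} sketch_flush_desc_t;

/* Example completion callback: counts how many times it was invoked. */
void sketch_count_completion(void *arg)
{
    ++*(int*)arg;
}

/* Start a flush: complete immediately when no stream is active, otherwise
 * count one pending per-stream flush for every active stream. */
sketch_status_t sketch_ep_flush(sketch_flush_desc_t *desc,
                                unsigned num_active_streams,
                                sketch_comp_cb_t comp_cb, void *comp_arg)
{
    unsigned i;

    if (num_active_streams == 0) {
        return SKETCH_OK;
    }

    desc->stream_counter = 0;
    desc->comp_cb        = comp_cb;
    desc->comp_arg       = comp_arg;
    desc->released       = 0;
    for (i = 0; i < num_active_streams; ++i) {
        ++desc->stream_counter; /* one per-stream descriptor registered */
    }

    return SKETCH_INPROGRESS;
}

/* Called when one stream has processed its flush event. The last stream to
 * finish invokes the user completion and releases the descriptor. */
void sketch_stream_flush_done(sketch_flush_desc_t *desc)
{
    assert(desc->stream_counter > 0);
    if (--desc->stream_counter == 0) {
        desc->comp_cb(desc->comp_arg);
        desc->released = 1;
    }
}
```

With two active streams, the user completion fires only after the second call to `sketch_stream_flush_done`, which matches the "counter reaches zero" step in the diagram.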
Estimated code review effort: 3 (Moderate), ~20 minutes
Only the last commit is relevant (the other commits are merged from another branch).
What?
Enforce host memory support for mem_type EP
Why?
Prevent cuda_ipc from being selected for mem_type EP
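One way to picture the intent is a selection filter that skips any transport that cannot access host memory. The sketch below is a toy model under stated assumptions, not the UCP selection code: the `sketch_*` names, the bitmask representation, and the memory types assigned to each transport in the usage note are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical memory-type flags for this model. */
enum {
    SKETCH_MEM_HOST = 1u << 0,
    SKETCH_MEM_CUDA = 1u << 1
};

/* Hypothetical transport description: a name plus the memory types it
 * is assumed to be able to access. */
typedef struct {
    const char *name;
    unsigned   mem_types;
} sketch_tl_t;

/* A transport qualifies for a mem_type EP only if it supports host memory. */
bool sketch_tl_usable_for_mem_type_ep(const sketch_tl_t *tl)
{
    return (tl->mem_types & SKETCH_MEM_HOST) != 0;
}

/* Return the first qualifying transport, or NULL if there is none. */
const sketch_tl_t *sketch_select_mem_type_tl(const sketch_tl_t *tls,
                                             size_t count)
{
    size_t i;

    assert((tls != NULL) || (count == 0));
    for (i = 0; i < count; ++i) {
        if (sketch_tl_usable_for_mem_type_ep(&tls[i])) {
            return &tls[i];
        }
    }

    return NULL;
}
```

If a transport modeled as CUDA-only (standing in for cuda_ipc) is listed before one modeled as host plus CUDA, the filter skips the first and picks the second, regardless of order.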
Summary by CodeRabbit
Release Notes
Improvements
Configuration Changes
Removed the ENABLE_SAME_PROCESS configuration option from CUDA IPC, simplifying deployment setup.